Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 157 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 9.8 KiB |
| Average record size in memory | 63.8 B |
Variable types
| Numeric | 11 |
|---|---|
| Year has constant value "2010" | Constant |
|---|---|
| df_index is highly correlated with Month | High correlation |
| Ozone is highly correlated with Temperature | High correlation |
| Month is highly correlated with df_index | High correlation |
| Temperature is highly correlated with Ozone | High correlation |
| Weather_C is highly correlated with Weather_S | High correlation |
| Weather_PS is highly correlated with Weather_S | High correlation |
| Weather_S is highly correlated with Weather_C and 1 other fields | High correlation |
| df_index is highly correlated with Month | High correlation |
| Ozone is highly correlated with Wind and 1 other fields | High correlation |
| Wind is highly correlated with Ozone | High correlation |
| Month is highly correlated with df_index | High correlation |
| Temperature is highly correlated with Ozone | High correlation |
| Weather_C is highly correlated with Weather_S | High correlation |
| Weather_PS is highly correlated with Weather_S | High correlation |
| Weather_S is highly correlated with Weather_C and 1 other fields | High correlation |
| df_index is highly correlated with Month | High correlation |
| Ozone is highly correlated with Temperature | High correlation |
| Month is highly correlated with df_index | High correlation |
| Temperature is highly correlated with Ozone | High correlation |
| Weather_C is highly correlated with Weather_S | High correlation |
| Weather_PS is highly correlated with Weather_S | High correlation |
| Weather_S is highly correlated with Weather_C and 1 other fields | High correlation |
| df_index is highly correlated with Ozone and 3 other fields | High correlation |
| Ozone is highly correlated with df_index and 3 other fields | High correlation |
| Wind is highly correlated with Ozone and 1 other fields | High correlation |
| Month is highly correlated with df_index and 2 other fields | High correlation |
| Day is highly correlated with df_index and 1 other fields | High correlation |
| Temperature is highly correlated with df_index and 4 other fields | High correlation |
| Weather_C is highly correlated with Weather_PS and 1 other fields | High correlation |
| Weather_PS is highly correlated with Weather_C and 1 other fields | High correlation |
| Weather_S is highly correlated with Weather_C and 1 other fields | High correlation |
| df_index is uniformly distributed | Uniform |
| df_index has unique values | Unique |
| Weather_C has 108 (68.8%) zeros | Zeros |
| Weather_PS has 110 (70.1%) zeros | Zeros |
| Weather_S has 96 (61.1%) zeros | Zeros |
Reproduction
| Analysis started | 2022-12-06 11:46:29.556350 |
|---|---|
| Analysis finished | 2022-12-06 11:46:43.827488 |
| Duration | 14.27 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
df_index
Real number (ℝ≥0)
HIGH CORRELATION, UNIFORM, UNIQUE
| Distinct | 157 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 79.00636943 |
| Minimum | 1 |
|---|---|
| Maximum | 158 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 8.8 |
| Q1 | 40 |
| median | 79 |
| Q3 | 118 |
| 95-th percentile | 149.2 |
| Maximum | 158 |
| Range | 157 |
| Interquartile range (IQR) | 78 |
Descriptive statistics
| Standard deviation | 45.47717049 |
|---|---|
| Coefficient of variation (CV) | 0.5756139767 |
| Kurtosis | -1.198795274 |
| Mean | 79.00636943 |
| Median Absolute Deviation (MAD) | 39 |
| Skewness | 0.0008506369735 |
| Sum | 12404 |
| Variance | 2068.173036 |
| Monotonicity | Strictly increasing |
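The descriptive statistics above can be cross-checked by hand. The sketch below assumes, from the reported minimum, maximum, distinct count and sum, that df_index holds the integers 1 to 158 with 157 absent. That reconstruction is an inference, not something the report states. It also assumes the table uses the sample (n - 1) variance. Only the Python standard library is used.

```python
import statistics

# Assumed reconstruction of df_index: 157 distinct integers in [1, 158]
# whose sum is 12404, which leaves 157 as the only absent value.
df_index = [v for v in range(1, 159) if v != 157]

total = sum(df_index)
mean = total / len(df_index)
variance = statistics.variance(df_index)   # sample variance (n - 1 denominator)
std = variance ** 0.5
cv = std / mean                            # coefficient of variation = std / mean
median = statistics.median(df_index)
# Median absolute deviation: median of the absolute distances to the median.
mad = statistics.median([abs(v - median) for v in df_index])

print("Sum     ", total)
print("Mean    ", mean)
print("Variance", variance)
print("Std     ", std)
print("CV      ", cv)
print("Median  ", median)
print("MAD     ", mad)
```

Under these assumptions the printed values can be compared line by line with the Sum, Mean, Variance, Standard deviation, CV, median and MAD rows of the table.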
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 0.6% |
| 109 | 1 | 0.6% |
| 102 | 1 | 0.6% |
| 103 | 1 | 0.6% |
| 104 | 1 | 0.6% |
| 105 | 1 | 0.6% |
| 106 | 1 | 0.6% |
| 107 | 1 | 0.6% |
| 108 | 1 | 0.6% |
| 110 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 0.6% |
| 2 | 1 | 0.6% |
| 3 | 1 | 0.6% |
| 4 | 1 | 0.6% |
| 5 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 7 | 1 | 0.6% |
| 8 | 1 | 0.6% |
| 9 | 1 | 0.6% |
| 10 | 1 | 0.6% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 158 | 1 | 0.6% |
| 156 | 1 | 0.6% |
| 155 | 1 | 0.6% |
| 154 | 1 | 0.6% |
| 153 | 1 | 0.6% |
| 152 | 1 | 0.6% |
| 151 | 1 | 0.6% |
| 150 | 1 | 0.6% |
| 149 | 1 | 0.6% |
| 148 | 1 | 0.6% |
Ozone
Real number (ℝ≥0)
| Distinct | 67 |
|---|---|
| Distinct (%) | 42.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39.19745223 |
| Minimum | 1 |
|---|---|
| Maximum | 168 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 21 |
| median | 31 |
| Q3 | 45 |
| 95-th percentile | 97 |
| Maximum | 168 |
| Range | 167 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 28.78199212 |
|---|---|
| Coefficient of variation (CV) | 0.7342822169 |
| Kurtosis | 3.064808929 |
| Mean | 39.19745223 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.663989806 |
| Sum | 6154 |
| Variance | 828.4030704 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 31 | 39 | 24.8% |
| 23 | 6 | 3.8% |
| 18 | 5 | 3.2% |
| 14 | 4 | 2.5% |
| 16 | 4 | 2.5% |
| 13 | 4 | 2.5% |
| 21 | 4 | 2.5% |
| 20 | 4 | 2.5% |
| 44 | 3 | 1.9% |
| 7 | 3 | 1.9% |
| Other values (57) | 81 | 51.6% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 0.6% |
| 4 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 7 | 3 | 1.9% |
| 8 | 1 | 0.6% |
| 9 | 3 | 1.9% |
| 10 | 1 | 0.6% |
| 11 | 3 | 1.9% |
| 12 | 2 | 1.3% |
| 13 | 4 | 2.5% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 168 | 1 | 0.6% |
| 135 | 1 | 0.6% |
| 122 | 1 | 0.6% |
| 118 | 1 | 0.6% |
| 115 | 1 | 0.6% |
| 110 | 1 | 0.6% |
| 108 | 1 | 0.6% |
| 97 | 2 | 1.3% |
| 96 | 1 | 0.6% |
| 91 | 1 | 0.6% |
Solar
Real number (ℝ≥0)
| Distinct | 118 |
|---|---|
| Distinct (%) | 75.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 185.9745223 |
| Minimum | 7 |
|---|---|
| Maximum | 334 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 24.8 |
| Q1 | 127 |
| median | 199 |
| Q3 | 255 |
| 95-th percentile | 308.2 |
| Maximum | 334 |
| Range | 327 |
| Interquartile range (IQR) | 128 |
Descriptive statistics
| Standard deviation | 87.04478283 |
|---|---|
| Coefficient of variation (CV) | 0.468046815 |
| Kurtosis | -0.8318933218 |
| Mean | 185.9745223 |
| Median Absolute Deviation (MAD) | 61 |
| Skewness | -0.4438554159 |
| Sum | 29198 |
| Variance | 7576.794219 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 199 | 7 | 4.5% |
| 238 | 4 | 2.5% |
| 259 | 4 | 2.5% |
| 190 | 3 | 1.9% |
| 223 | 3 | 1.9% |
| 220 | 3 | 1.9% |
| 175 | 3 | 1.9% |
| 131 | 2 | 1.3% |
| 92 | 2 | 1.3% |
| 191 | 2 | 1.3% |
| Other values (108) | 124 | 79.0% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 1 | 0.6% |
| 8 | 1 | 0.6% |
| 13 | 1 | 0.6% |
| 14 | 1 | 0.6% |
| 19 | 1 | 0.6% |
| 20 | 1 | 0.6% |
| 24 | 2 | 1.3% |
| 25 | 1 | 0.6% |
| 27 | 1 | 0.6% |
| 31 | 1 | 0.6% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 334 | 1 | 0.6% |
| 332 | 1 | 0.6% |
| 323 | 1 | 0.6% |
| 322 | 2 | 1.3% |
| 320 | 1 | 0.6% |
| 314 | 1 | 0.6% |
| 313 | 1 | 0.6% |
| 307 | 1 | 0.6% |
| 299 | 1 | 0.6% |
| 295 | 1 | 0.6% |
Wind
Real number (ℝ≥0)
| Distinct | 31 |
|---|---|
| Distinct (%) | 19.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.929936306 |
| Minimum | 1.7 |
|---|---|
| Maximum | 20.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 1.7 |
|---|---|
| 5-th percentile | 4.6 |
| Q1 | 7.4 |
| median | 9.7 |
| Q3 | 11.5 |
| 95-th percentile | 15.5 |
| Maximum | 20.7 |
| Range | 19 |
| Interquartile range (IQR) | 4.1 |
Descriptive statistics
| Standard deviation | 3.505187821 |
|---|---|
| Coefficient of variation (CV) | 0.3529919743 |
| Kurtosis | 0.114443177 |
| Mean | 9.929936306 |
| Median Absolute Deviation (MAD) | 2.3 |
| Skewness | 0.3650639762 |
| Sum | 1559 |
| Variance | 12.28634166 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=31)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.5 | 15 | 9.6% |
| 8 | 12 | 7.6% |
| 7.4 | 11 | 7.0% |
| 10.3 | 11 | 7.0% |
| 9.7 | 11 | 7.0% |
| 6.9 | 10 | 6.4% |
| 6.3 | 8 | 5.1% |
| 9.2 | 8 | 5.1% |
| 10.9 | 8 | 5.1% |
| 8.6 | 8 | 5.1% |
| Other values (21) | 55 | 35.0% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.7 | 1 | 0.6% |
| 2.3 | 1 | 0.6% |
| 2.8 | 1 | 0.6% |
| 3.4 | 1 | 0.6% |
| 4 | 1 | 0.6% |
| 4.1 | 1 | 0.6% |
| 4.6 | 4 | 2.5% |
| 5.1 | 3 | 1.9% |
| 5.7 | 3 | 1.9% |
| 6.3 | 8 | 5.1% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 20.7 | 1 | 0.6% |
| 20.1 | 1 | 0.6% |
| 18.4 | 1 | 0.6% |
| 16.6 | 3 | 1.9% |
| 16.1 | 1 | 0.6% |
| 15.5 | 3 | 1.9% |
| 14.9 | 8 | 5.1% |
| 14.3 | 6 | 3.8% |
| 13.8 | 5 | 3.2% |
| 13.2 | 3 | 1.9% |
Month
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.01910828 |
| Minimum | 5 |
|---|---|
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 756.0 B |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 6 |
| median | 7 |
| Q3 | 8 |
| 95-th percentile | 9 |
| Maximum | 9 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.434337771 |
|---|---|
| Coefficient of variation (CV) | 0.2043475771 |
| Kurtosis | -1.325691883 |
| Mean | 7.01910828 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.02068114319 |
| Sum | 1102 |
| Variance | 2.057324841 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
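Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 33 | 21.0% |
| 5 | 32 | 20.4% |
| 7 | 31 | 19.7% |
| 8 | 31 | 19.7% |
| 6 | 30 | 19.1% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 32 | 20.4% |
| 6 | 30 | 19.1% |
| 7 | 31 | 19.7% |
| 8 | 31 | 19.7% |
| 9 | 33 | 21.0% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 33 | 21.0% |
| 8 | 31 | 19.7% |
| 7 | 31 | 19.7% |
| 6 | 30 | 19.1% |
| 5 | 32 | 20.4% |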
Day
Real number (ℝ≥0)
| Distinct | 31 |
|---|---|
| Distinct (%) | 19.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.92993631 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 24 |
| 95-th percentile | 29.2 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 8.974404237 |
|---|---|
| Coefficient of variation (CV) | 0.5633672392 |
| Kurtosis | -1.225943132 |
| Mean | 15.92993631 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -0.02655457059 |
| Sum | 2501 |
| Variance | 80.53993141 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=31)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 6 | 3.8% |
| 29 | 6 | 3.8% |
| 27 | 6 | 3.8% |
| 26 | 6 | 3.8% |
| 17 | 5 | 3.2% |
| 30 | 5 | 3.2% |
| 28 | 5 | 3.2% |
| 25 | 5 | 3.2% |
| 24 | 5 | 3.2% |
| 23 | 5 | 3.2% |
| Other values (21) | 103 | 65.6% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 6 | 3.8% |
| 2 | 5 | 3.2% |
| 3 | 5 | 3.2% |
| 4 | 5 | 3.2% |
| 5 | 5 | 3.2% |
| 6 | 5 | 3.2% |
| 7 | 5 | 3.2% |
| 8 | 5 | 3.2% |
| 9 | 5 | 3.2% |
| 10 | 5 | 3.2% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 31 | 3 | 1.9% |
| 30 | 5 | 3.2% |
| 29 | 6 | 3.8% |
| 28 | 5 | 3.2% |
| 27 | 6 | 3.8% |
| 26 | 6 | 3.8% |
| 25 | 5 | 3.2% |
| 24 | 5 | 3.2% |
| 23 | 5 | 3.2% |
| 22 | 5 | 3.2% |
Year
Real number (ℝ≥0)
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2010 |
| Minimum | 2010 |
|---|---|
| Maximum | 2010 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 2010 |
|---|---|
| 5-th percentile | 2010 |
| Q1 | 2010 |
| median | 2010 |
| Q3 | 2010 |
| 95-th percentile | 2010 |
| Maximum | 2010 |
| Range | 0 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0 |
|---|---|
| Coefficient of variation (CV) | 0 |
| Kurtosis | 0 |
| Mean | 2010 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0 |
| Sum | 315570 |
| Variance | 0 |
| Monotonicity | Increasing |
Histogram with fixed size bins (bins=1)
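Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2010 | 157 | 100.0% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 2010 | 157 | 100.0% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 2010 | 157 | 100.0% |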
Temperature
Real number (ℝ≥0)
| Distinct | 40 |
|---|---|
| Distinct (%) | 25.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 77.74522293 |
| Minimum | 56 |
|---|---|
| Maximum | 97 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.4 KiB |
Quantile statistics
| Minimum | 56 |
|---|---|
| 5-th percentile | 60.6 |
| Q1 | 72 |
| median | 79 |
| Q3 | 84 |
| 95-th percentile | 92 |
| Maximum | 97 |
| Range | 41 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 9.405334315 |
|---|---|
| Coefficient of variation (CV) | 0.1209763631 |
| Kurtosis | -0.4128496196 |
| Mean | 77.74522293 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -0.3447816805 |
| Sum | 12206 |
| Variance | 88.46031357 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=40)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 81 | 11 | 7.0% |
| 76 | 10 | 6.4% |
| 82 | 9 | 5.7% |
| 77 | 8 | 5.1% |
| 86 | 7 | 4.5% |
| 78 | 6 | 3.8% |
| 79 | 6 | 3.8% |
| 67 | 5 | 3.2% |
| 73 | 5 | 3.2% |
| 80 | 5 | 3.2% |
| Other values (30) | 85 | 54.1% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 56 | 1 | 0.6% |
| 57 | 3 | 1.9% |
| 58 | 2 | 1.3% |
| 59 | 2 | 1.3% |
| 61 | 3 | 1.9% |
| 62 | 2 | 1.3% |
| 63 | 1 | 0.6% |
| 64 | 2 | 1.3% |
| 65 | 2 | 1.3% |
| 66 | 3 | 1.9% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 97 | 1 | 0.6% |
| 96 | 1 | 0.6% |
| 94 | 2 | 1.3% |
| 93 | 3 | 1.9% |
| 92 | 5 | 3.2% |
| 91 | 2 | 1.3% |
| 90 | 3 | 1.9% |
| 89 | 2 | 1.3% |
| 88 | 3 | 1.9% |
| 87 | 5 | 3.2% |
Weather_C
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3121019108 |
| Minimum | 0 |
|---|---|
| Maximum | 1 |
| Zeros | 108 |
| Zeros (%) | 68.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 285.0 B |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.464833899 |
|---|---|
| Coefficient of variation (CV) | 1.489365758 |
| Kurtosis | -1.346749352 |
| Mean | 0.3121019108 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.8188842555 |
| Sum | 49 |
| Variance | 0.2160705537 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=2)
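Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 108 | 68.8% |
| 1 | 49 | 31.2% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 108 | 68.8% |
| 1 | 49 | 31.2% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 49 | 31.2% |
| 0 | 108 | 68.8% |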
Weather_PS
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2993630573 |
| Minimum | 0 |
|---|---|
| Maximum | 1 |
| Zeros | 110 |
| Zeros (%) | 70.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 285.0 B |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.4594445944 |
|---|---|
| Coefficient of variation (CV) | 1.534740454 |
| Kurtosis | -1.233254014 |
| Mean | 0.2993630573 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.8846586028 |
| Sum | 47 |
| Variance | 0.2110893353 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=2)
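Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 110 | 70.1% |
| 1 | 47 | 29.9% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 110 | 70.1% |
| 1 | 47 | 29.9% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 47 | 29.9% |
| 0 | 110 | 70.1% |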
Weather_S
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3885350318 |
| Minimum | 0 |
|---|---|
| Maximum | 1 |
| Zeros | 96 |
| Zeros (%) | 61.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 285.0 B |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.488976974 |
|---|---|
| Coefficient of variation (CV) | 1.258514507 |
| Kurtosis | -1.809968786 |
| Mean | 0.3885350318 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.4617936296 |
| Sum | 61 |
| Variance | 0.2390984811 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=2)
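Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 96 | 61.1% |
| 1 | 61 | 38.9% |
Smallest values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 96 | 61.1% |
| 1 | 61 | 38.9% |
Largest values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 61 | 38.9% |
| 0 | 96 | 61.1% |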
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it is better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of their standard deviations.
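The calculation just described can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are made up for this example, and tied values are assumed to receive the average of the ranks they span.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
print("Spearman:", spearman_rho(x, y))
print("Pearson: ", pearson_r(x, y))
```

Because y rises whenever x rises, the ranks agree exactly and ρ reaches its maximum, while Pearson's r stays below 1 for the same data.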
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
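As a rough illustration of this definition and of the invariance property, the sketch below computes r in plain Python. The function name `pearson_r` is made up for this example. The data are the Ozone and Temperature values of the ten rows shown under "First rows", so the result describes that small sample only, not the full dataset.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n - 1) factors in the covariance and the standard deviations cancel.
    return cov / math.sqrt(ss_x * ss_y)

# Ozone and Temperature from the ten "First rows" of the report.
ozone = [41, 36, 12, 18, 31, 28, 23, 19, 8, 31]
temperature = [67, 72, 74, 62, 56, 66, 65, 59, 61, 69]

r = pearson_r(ozone, temperature)

# Arbitrary shift and positive rescale of one variable: r should not change.
shifted = [(t - 32) * 5 / 9 for t in temperature]
r_shifted = pearson_r(ozone, shifted)

print("r on the sample:       ", r)
print("r after shift/rescale: ", r_shifted)
```

The two printed values agree up to floating-point rounding, which is the location and scale invariance mentioned above.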
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
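The pair-counting rule above corresponds to the variant usually called tau-a. The sketch below implements exactly that rule in plain Python. The function name `kendall_tau_a` is made up for this example, and tied pairs are simply counted as neither concordant nor discordant. Statistical libraries often use a tie-adjusted variant instead, so their results may differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in x or y; it counts for neither side
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Four observations give six pairs; only the pair (2, 3) / (3, 2) is discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print("tau-a:", tau)
```

With five concordant pairs and one discordant pair out of six, τ equals (5 - 1) / 6.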
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available for it.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | df_index | Ozone | Solar | Wind | Month | Day | Year | Temperature | Weather_C | Weather_PS | Weather_S |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 41.0 | 190.0 | 7.4 | 5 | 1 | 2010 | 67 | 0 | 0 | 1 |
| 1 | 2 | 36.0 | 118.0 | 8.0 | 5 | 2 | 2010 | 72 | 1 | 0 | 0 |
| 2 | 3 | 12.0 | 149.0 | 12.6 | 5 | 3 | 2010 | 74 | 0 | 1 | 0 |
| 3 | 4 | 18.0 | 313.0 | 11.5 | 5 | 4 | 2010 | 62 | 0 | 0 | 1 |
| 4 | 5 | 31.0 | 199.0 | 14.3 | 5 | 5 | 2010 | 56 | 0 | 0 | 1 |
| 5 | 6 | 28.0 | 199.0 | 14.9 | 5 | 6 | 2010 | 66 | 1 | 0 | 0 |
| 6 | 7 | 23.0 | 299.0 | 8.6 | 5 | 7 | 2010 | 65 | 0 | 1 | 0 |
| 7 | 8 | 19.0 | 99.0 | 13.8 | 5 | 8 | 2010 | 59 | 1 | 0 | 0 |
| 8 | 9 | 8.0 | 19.0 | 20.1 | 5 | 9 | 2010 | 61 | 0 | 1 | 0 |
| 9 | 10 | 31.0 | 194.0 | 8.6 | 5 | 10 | 2010 | 69 | 0 | 0 | 1 |
Last rows
| | df_index | Ozone | Solar | Wind | Month | Day | Year | Temperature | Weather_C | Weather_PS | Weather_S |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 147 | 148 | 14.0 | 20.0 | 16.6 | 9 | 25 | 2010 | 63 | 0 | 1 | 0 |
| 148 | 149 | 30.0 | 193.0 | 6.9 | 9 | 26 | 2010 | 70 | 1 | 0 | 0 |
| 149 | 150 | 31.0 | 145.0 | 13.2 | 9 | 27 | 2010 | 77 | 0 | 1 | 0 |
| 150 | 151 | 14.0 | 191.0 | 14.3 | 9 | 28 | 2010 | 75 | 0 | 0 | 1 |
| 151 | 152 | 18.0 | 131.0 | 8.0 | 9 | 29 | 2010 | 76 | 0 | 1 | 0 |
| 152 | 153 | 20.0 | 223.0 | 11.5 | 9 | 30 | 2010 | 68 | 0 | 0 | 1 |
| 153 | 154 | 41.0 | 190.0 | 7.4 | 5 | 1 | 2010 | 67 | 1 | 0 | 0 |
| 154 | 155 | 30.0 | 193.0 | 6.9 | 9 | 26 | 2010 | 70 | 0 | 1 | 0 |
| 155 | 156 | 31.0 | 145.0 | 13.2 | 9 | 27 | 2010 | 77 | 0 | 0 | 1 |
| 156 | 158 | 18.0 | 131.0 | 8.0 | 9 | 29 | 2010 | 76 | 1 | 0 | 0 |